AITopics | probit model

Collaborating Authors

probit model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Neural Collapse in Cumulative Link Models for Ordinal Regression: An Analysis with Unconstrained Feature Model

Neural Information Processing SystemsJun-19-2026, 21:41:58 GMT

A phenomenon known as "Neural Collapse (NC)" in deep classification tasks, in which the penultimate-layer features and the final classifiers exhibit an extremely simple geometric structure, has recently attracted considerable attention, with the expectation that it can deepen our understanding of how deep neural networks behave. The Unconstrained Feature Model (UFM) has been proposed to explain NC theoretically, and there emerges a growing body of work that extends NC to tasks other than classification and leverages it for practical applications. In this study, we investigate whether a similar phenomenon arises in deep Ordinal Regression (OR) tasks, via combining the cumulative link model for OR and UFM. We show that a phenomenon we call Ordinal Neural Collapse (ONC) indeed emerges and is characterized by the following three properties: (ONC1) all optimal features in the same class collapse to their within-class mean when regularization is applied; (ONC2) these class means align with the classifier, meaning that they collapse onto a one-dimensional subspace; (ONC3) the optimal latent variables (corresponding to logits or preactivations in classification tasks) are aligned according to the class order, and in particular, in the zero-regularization limit, a highly local and simple geometric relationship emerges between the latent variables and the threshold values. We prove these properties analytically within the UFM framework with fixed threshold values and corroborate them empirically across a variety of datasets. We also discuss how these insights can be leveraged in OR, highlighting the use of fixed thresholds.

artificial intelligence, deep learning, machine learning, (15 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

MLCBART: Multilabel Classification with Bayesian Additive Regression Trees

Tian, Jiahao, Chipman, Hugh, Loughin, Thomas

arXiv.org Machine LearningJan-15-2026

Multilabel Classification (MLC) deals with the simultaneous classification of multiple binary labels. The task is challenging because, not only may there be arbitrarily different and complex relationships between predictor variables and each label, but associations among labels may exist even after accounting for effects of predictor variables. In this paper, we present a Bayesian additive regression tree (BART) framework to model the problem. BART is a nonparametric and flexible model structure capable of uncovering complex relationships within the data. Our adaptation, MLCBART, assumes that labels arise from thresholding an underlying numeric scale, where a multivariate normal model allows explicit estimation of the correlation structure among labels. This enables the discovery of complicated relationships in various forms and improves MLC predictive performance. Our Bayesian framework not only enables uncertainty quantification for each predicted label, but our MCMC draws produce an estimated conditional probability distribution of label combinations for any predictor values. Simulation experiments demonstrate the effectiveness of the proposed model by comparing its performance with a set of models, including the oracle model with the correct functional form. Results show that our model predicts vectors of labels more accurately than other contenders and its performance is close to the oracle model. An example highlights how the method's ability to produce measures of uncertainty on predictions provides nuanced understanding of classification results.

artificial intelligence, correlation matrix, machine learning, (17 more...)

arXiv.org Machine Learning

2601.08964

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.71)

Add feedback

Learning Correlated Reward Models: Statistical Barriers and Opportunities

Cherapanamjeri, Yeshwanth, Daskalakis, Constantinos, Farina, Gabriele, Mohammadpour, Sobhan

arXiv.org Machine LearningOct-20-2025

Random Utility Models (RUMs) are a classical framework for modeling user preferences and play a key role in reward modeling for Reinforcement Learning from Human Feedback (RLHF). However, a crucial shortcoming of many of these techniques is the Independence of Irrelevant Alternatives (IIA) assumption, which collapses \emph{all} human preferences to a universal underlying utility function, yielding a coarse approximation of the range of human preferences. On the other hand, statistical and computational guarantees for models avoiding this assumption are scarce. In this paper, we investigate the statistical and computational challenges of learning a \emph{correlated} probit model, a fundamental RUM that avoids the IIA assumption. First, we establish that the classical data collection paradigm of pairwise preference data is \emph{fundamentally insufficient} to learn correlational information, explaining the lack of statistical and computational guarantees in this setting. Next, we demonstrate that \emph{best-of-three} preference data provably overcomes these shortcomings, and devise a statistically and computationally efficient estimator with near-optimal performance. These results highlight the benefits of higher-order preference data in learning correlated utilities, allowing for more fine-grained modeling of human preferences. Finally, we validate these theoretical guarantees on several real-world datasets, demonstrating improved personalization of human preferences.

artificial intelligence, machine learning, royal tenenbaum, (18 more...)

arXiv.org Machine Learning

2510.15839

Country:

North America > United States (0.93)
Europe (0.92)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)
Leisure & Entertainment (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Add feedback

Estimating Conditional Covariance between labels for Multilabel Data

Park, Laurence A. F., Read, Jesse

arXiv.org Artificial IntelligenceAug-27-2025

Multilabel data should be analysed for label dependence before applying multilabel models. Independence between multilabel data labels cannot be measured directly from the label values due to their dependence on the set of covariates $\vec{x}$, but can be measured by examining the conditional label covariance using a multivariate Probit model. Unfortunately, the multivariate Probit model provides an estimate of its copula covariance, and so might not be reliable in estimating constant covariance and dependent covariance. In this article, we compare three models (Multivariate Probit, Multivariate Bernoulli and Staged Logit) for estimating the constant and dependent multilabel conditional label covariance. We provide an experiment that allows us to observe each model's measurement of conditional covariance. We found that all models measure constant and dependent covariance equally well, depending on the strength of the covariance, but the models all falsely detect that dependent covariance is present for data where constant covariance is present. Of the three models, the Multivariate Probit model had the lowest error rate.

artificial intelligence, covariance, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.18951

Country: Europe (0.68)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.88)

Add feedback

Neural Collapse in Cumulative Link Models for Ordinal Regression: An Analysis with Unconstrained Feature Model

Ma, Chuang, Obuchi, Tomoyuki, Tanaka, Toshiyuki

arXiv.org Machine LearningJun-9-2025

A phenomenon known as ''Neural Collapse (NC)'' in deep classification tasks, in which the penultimate-layer features and the final classifiers exhibit an extremely simple geometric structure, has recently attracted considerable attention, with the expectation that it can deepen our understanding of how deep neural networks behave. The Unconstrained Feature Model (UFM) has been proposed to explain NC theoretically, and there emerges a growing body of work that extends NC to tasks other than classification and leverages it for practical applications. In this study, we investigate whether a similar phenomenon arises in deep Ordinal Regression (OR) tasks, via combining the cumulative link model for OR and UFM. We show that a phenomenon we call Ordinal Neural Collapse (ONC) indeed emerges and is characterized by the following three properties: (ONC1) all optimal features in the same class collapse to their within-class mean when regularization is applied; (ONC2) these class means align with the classifier, meaning that they collapse onto a one-dimensional subspace; (ONC3) the optimal latent variables (corresponding to logits or preactivations in classification tasks) are aligned according to the class order, and in particular, in the zero-regularization limit, a highly local and simple geometric relationship emerges between the latent variables and the threshold values. We prove these properties analytically within the UFM framework with fixed threshold values and corroborate them empirically across a variety of datasets. We also discuss how these insights can be leveraged in OR, highlighting the use of fixed thresholds.

artificial intelligence, deep learning, machine learning, (13 more...)

arXiv.org Machine Learning

2506.05801

Country:

Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Bayesian Regression for Predicting Subscription to Bank Term Deposits in Direct Marketing Campaigns

Tanvir, Muhammad Farhan, Hossain, Md Maruf, Jishan, Md Asifuzzaman

arXiv.org Artificial IntelligenceOct-28-2024

In the highly competitive environment of the banking industry, it is essential to precisely forecast the behavior of customers in order to maximize the effectiveness of marketing initiatives and improve financial consequences. The purpose of this research is to examine the efficacy of logit and probit models in predicting term deposit subscriptions using a Portuguese bank's direct marketing data. There are several demographic, economic, and behavioral characteristics in the dataset that affect the probability of subscribing. To increase model performance and provide an unbiased evaluation, the target variable was balanced, considering the inherent imbalance in the dataset. The two model's prediction abilities were evaluated using Bayesian techniques and Leave-One-Out Cross-Validation (LOO-CV). The logit model performed better than the probit model in handling this classification problem. The results highlight the relevance of model selection when dealing with complicated decision-making processes in the financial services industry and imbalanced datasets. Findings from this study shed light on how banks can optimize their decision-making processes, improve their client segmentation, and boost their marketing campaigns by utilizing machine learning models.

artificial intelligence, machine learning, regression, (15 more...)

arXiv.org Artificial Intelligence

2410.21539

Country:

Europe > Germany > Brandenburg > Potsdam (0.05)
Asia > Indonesia (0.04)
Asia > India > Goa (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry: Banking & Finance > Financial Services (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Add feedback

Thinning Measurement Models and Questionnaire Design Ricardo Silva Department of Statistical Science University College London Gower Street, London WC1E 6BT ricardo@stats.ucl.ac.uk

Neural Information Processing SystemsMar-15-2024, 09:58:22 GMT

Inferring key unobservable features of individuals is an important task in the applied sciences. In particular, an important source of data in fields such as marketing, social sciences and medicine is questionnaires: answers in such questionnaires are noisy measures of target unobserved features. While comprehensive surveys help to better estimate the latent variables of interest, aiming at a high number of questions comes at a price: refusal to participate in surveys can go up, as well as the rate of missing data; quality of answers can decline; costs associated with applying such questionnaires can also increase. In this paper, we cast the problem of refining existing models for questionnaire data as follows: solve a constrained optimization problem of preserving the maximum amount of information found in a latent variable model using only a subset of existing questions. The goal is to find an optimal subset of a given size. For that, we first define an information theoretical measure for quantifying the quality of a reduced questionnaire. Three different approximate inference methods are introduced to solve this problem. Comparisons against a simple but powerful heuristic are presented.

indicator, latent variable, questionnaire, (14 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Essex > Colchester (0.04)

Genre: Questionnaire & Opinion Survey (1.00)

Industry: Health & Medicine > Health Care Providers & Services (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)

Add feedback

Expectation propagation for the smoothing distribution in dynamic probit

Anceschi, Niccolò, Fasano, Augusto, Rebaudo, Giovanni

arXiv.org Machine LearningSep-4-2023

The smoothing distribution of dynamic probit models with Gaussian state dynamics was recently proved to belong to the unified skew-normal family. Although this is computationally tractable in small-to-moderate settings, it may become computationally impractical in higher dimensions. In this work, adapting a recent more general class of expectation propagation (EP) algorithms, we derive an efficient EP routine to perform inference for such a distribution. We show that the proposed approximation leads to accuracy gains over available approximate algorithms in a financial illustration.

approximation, probit model, propagation, (12 more...)

arXiv.org Machine Learning

2309.01641

Country:

Europe > Italy > Piedmont > Turin Province > Turin (0.05)
North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning (0.93)

Add feedback

Efficient computation of predictive probabilities in probit models via expectation propagation

Fasano, Augusto, Anceschi, Niccolò, Franzolini, Beatrice, Rebaudo, Giovanni

arXiv.org Machine LearningSep-4-2023

Binary regression models represent a popular model-based approach for binary classification. In the Bayesian framework, computational challenges in the form of the posterior distribution motivate still-ongoing fruitful research. Here, we focus on the computation of predictive probabilities in Bayesian probit models via expectation propagation (EP). Leveraging more general results in recent literature, we show that such predictive probabilities admit a closed-form expression. Improvements over state-of-the-art approaches are shown in a simulation study.

artificial intelligence, machine learning, predictive probability, (16 more...)

arXiv.org Machine Learning

2309.0163

Country:

Europe > Italy > Piedmont > Turin Province > Turin (0.06)
North America > United States (0.05)
Asia > Singapore (0.05)

Genre: Research Report (0.85)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Add feedback

Efficient expectation propagation for posterior approximation in high-dimensional probit models

Fasano, Augusto, Anceschi, Niccolò, Franzolini, Beatrice, Rebaudo, Giovanni

arXiv.org Machine LearningSep-4-2023

Bayesian binary regression is a prosperous area of research due to the computational challenges encountered by currently available methods either for high-dimensional settings or large datasets, or both. In the present work, we focus on the expectation propagation (EP) approximation of the posterior distribution in Bayesian probit regression under a multivariate Gaussian prior distribution. Adapting more general derivations in Anceschi et al. (2023), we show how to leverage results on the extended multivariate skew-normal distribution to derive an efficient implementation of the EP routine having a per-iteration cost that scales linearly in the number of covariates. This makes EP computationally feasible also in challenging high-dimensional settings, as shown in a detailed simulation study.

approximation, artificial intelligence, implementation, (16 more...)

arXiv.org Machine Learning

2309.01619

Country:

Europe > Italy > Piedmont > Turin Province > Turin (0.05)
North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence (0.48)

Add feedback